Use Travis CI #134
Conversation
Fun. I guess Travis CI was already enabled here. So, we can just wait and see if this works or not. :)
If this works out, it'll be a nice way to vet changes before we merge.
Ah, it's rebuilding from scratch because of the
That's my hope. It may even double as a mechanism for releases, which would reduce the maintenance burden.
I don't think that should be a problem. Unless, of course, So, I saw this line in the
Yeah, we should add a
Yes. It was a safeguard to ensure I and others didn't accidentally release from a local box.
So, the build procedure might need to change a little bit. As the cache includes the entire repo, if something changes in some unrelated part, the whole cache will still be invalidated and we will need to build from scratch. Also, it appears that progress bars are causing our log to be too long for Travis. ( travis-ci/travis-ci#3034 )
Does Travis cache anything across builds? In the manual process today, the "cache" comes from building on the same VM again and again, so the history just happens to be there. When the latest source is fetched from git, none of the local files are touched except those that changed in master. So only the images that truly had changes on master break the cache. Everything else builds speedily.
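A rough sketch of that manual flow, assuming the repo's standard layout (the image directories shown are illustrative, not the exact maintainer commands):

    # Hypothetical sketch of the manual rebuild on the VM; the real workflow may differ.
    cd docker-stacks
    git pull origin master
    # Unchanged Dockerfiles hit Docker's layer cache, so only images whose files
    # actually changed on master rebuild from scratch.
    docker build -t jupyter/minimal-notebook minimal-notebook/
    docker build -t jupyter/scipy-notebook scipy-notebook/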
Would I be able to access the VM to take a look? I think it would help me understand what is going on a bit more.
Added your GitHub public keys to the VM. The instructions for SSHing and navigating around are in the README: https://github.com/jupyter/docker-stacks#maintainer-workflow. If you go there right now and run
Perfect. Thanks.
So, it looks like our copy of Admittedly, there will be times when our releases will be using an older copy of
Even if this doesn't work, it looks like we will still get a few bug fixes.
Well, unfortunately, I don't think this is going to work. We are running into the maximum time limit here. Could we maybe consider using Docker Hub's automated builds for
The problem with the Docker Hub auto build is that there's no way to tag according to the SHA. We'd have to make a tag/branch here every time, and then manually tell Docker Hub to build those.
Sure, that's reasonable. What about an internal image that includes the
There were some merge conflicts, which have been resolved. Though I think it will be a while still before we need to worry about merging this (if at all).
I'm OK with merging it once it goes green and leaving it active as a way to test PRs, even if it's not tied to the release process. It beats having nothing at all while we continue to look for ways of automating the entire workflow. Plus, we'll start to get data on how many times it actually works on Travis vs fails because of timeouts or other problems.
My concern is that the debian image gets updated so frequently that we will often have to rebuild everything from scratch through the CI and see these build failures. It would be nice to have a way to mitigate this. A totally different thought might be to create a "mirror" of the debian image. Namely, we add a
We could then just update this "mirror" image for releases. As it won't be updated between releases, we won't have to deal with cache invalidation in the CI. Releases could continue to proceed through the VM as they do now. That all being said, there is a chance that some change to the debian image breaks our release, and this system would not show that sort of problem. Though this is normally a rare enough event that it shouldn't be much of an issue. Besides, simply merging our currently proposed change won't let the CI system catch that problem anyway, since a complete rebuild doesn't fit in the time limit. Thoughts?
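A rough sketch of such a "mirror", assuming debian:jessie as the base and a hypothetical jupyter/debian-mirror name (neither is settled in this thread):

    # Hypothetical: retag the upstream base under an internal name and only refresh it
    # at release time, so CI layer caches stay valid between releases.
    docker pull debian:jessie
    docker tag debian:jessie jupyter/debian-mirror:jessie
    docker push jupyter/debian-mirror:jessie
    # minimal-notebook would then build FROM jupyter/debian-mirror:jessie instead of debian:jessie.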
To put this in perspective, our current manual build system mitigates this problem by keeping debian cached on the build VM. It's nice and speedy, but the downside is that when debian is updated for good reason (e.g., a CVE), unless we know to re-pull it, we're building off a base image with known defects. There's no silver bullet here. We need to pick what we're willing to trade off: security/fixes to the base for build speed, or vice versa. I honestly think "we're doing it wrong" today. We should be providing users with the most secure / bug-free images we can, not optimizing for what's convenient to run on Travis or the current build VM.
I feel like we're catering to limitations of the free CI systems in doing this. But that's just my opinion.
So, I think there are two different issues and it would be better if we looked at them separately.
My primary issue at present is to solve 1. Solving 2 would be nice, but IMHO it is secondary to solving 1. The reason I think 1 is more important is that it is hampering our development cycle. Having reliable automated testing gives us some confidence in changes proposed to the stacks. It also cuts down on developer time spent verifying these changes, and it gives contributors feedback sooner. Once we have resolved the bottleneck in 1, we can focus on 2. However, if we focus on 2 now, 1 remains a bottleneck, so it doesn't really do us any good.
If our current solution is to keep the debian image cached on the VM and that is how we are doing deployments, then I think having our testing infrastructure mimic that is reasonable. To my understanding, that is what my proposal above does.
As clarification on another point, I don't think we should worry about doing releases with Travis. We should stick to the VM at present, though we can certainly revisit this if we find a Travis configuration that is reliable for automated releases.
In the long term we do need to have a discussion about fixing our testing and release system (and I think you have raised some valid concerns), though I think that pertains to point 2. As stated before, making sure 1 is done reliably will speed up our development time and allow us to focus more on 2.
I agree that we should be tackling #1 here.
Sorry. I'm losing track. I'm assuming this is the proposal:
How about we instead use a digest reference in the minimal-notebook Dockerfile to pin to a specific debian image, rather than the tag (which floats across rebuilds) or our own internal image (which adds maintenance burden)?
When there are rebuilds of debian with critical security fixes, we do a PR with a new digest.
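For reference, pinning by digest in a Dockerfile looks roughly like this (the digest value is a placeholder, not the actual one used in this PR):

    # Pin the base image by content digest rather than by floating tag.
    # The placeholder below would be replaced with the real digest, e.g. as reported
    # by `docker pull debian:jessie` or `docker inspect debian:jessie`, and bumped
    # via a PR whenever a security fix lands upstream.
    FROM debian@sha256:<digest-of-the-pinned-jessie-image>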
Sure, that sounds fine. I'll update this PR to reflect that change.
This is done to more explicitly track what version of Debian Jessie is being used as a base image. It will also ensure that it is properly updated on the VM even if we forget. This also should help CI and VM builds stay speedy by using the cache even when there is a newer version of Debian Jessie. In the long run, we may wish to re-evaluate this strategy and fix our CI and deployment systems so as to be able to use the latest version of Debian Jessie with important CVE and other fixes.
How does this look?
It looks like the right stuff, but Travis doesn't seem to think so. It used the cache up to the line that installs Miniconda, and at that point, it started building the layers from scratch.
I added quiet flags to every conda command. This was done because the logs were exceeding a maximum log size on Travis (~4MB). As a result, it is doing the right thing and tossing the cache during the conda install, as the quiet flag is new. We could do another release after merging this. Alternatively, I could backport the quiet flags and we could release that. Either should help resolve this issue for future PRs.
Ah, you mean this hasn't been rebased? Go ahead and rebase it. We can kill and restart the build. Let's confirm that it works as expected before merging.
@@ -70,11 +71,11 @@ RUN cd /tmp && \
     echo "9ea57c0fdf481acf89d816184f969b04bc44dea27b258c4e86b1e3a25ff26aa0 *Miniconda3-3.19.0-Linux-x86_64.sh" | sha256sum -c - && \
     /bin/bash Miniconda3-3.19.0-Linux-x86_64.sh -f -b -p $CONDA_DIR && \
     rm Miniconda3-3.19.0-Linux-x86_64.sh && \
-    $CONDA_DIR/bin/conda install --yes conda==3.19.1 && \
+    $CONDA_DIR/bin/conda install --quiet --yes conda==3.19.1 && \
This is why the Miniconda install line cannot come from cache. The change is not currently in master.
Sorry, I was a bit unclear. Above is an example of the quiet flag added in this PR that was not in master.
Ah. My bad. Wrong way. OK. Let's let the build go as far as it can, then we can merge. At that point, we can rebuild the images on the build VM because there were indeed changes (albeit superficial ones for the quiet flag). After that build/push, we can kick Travis to have it rebuild again and see what happens.
No worries. I had forgotten about the quiet flag busting the cache. OK, sounds good.
Appears to have passed. :)
Alright, it's building on the VM under tmux release. Might need your help with the wiki though.
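Assuming the tmux session is literally named "release", watching that build would look something like:

    # Attach to the running build session; detach again with Ctrl-b d.
    tmux attach -t release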
The webhook automatically updates the build page on the wiki when the last image (currently all-spark-notebook) finishes pushing successfully. Nothing you should need to do.
Thanks for clarifying.
Not a problem. I've added a similar note to the maintainer section of the root README as well.
Sounds good. Also, it looks like we are getting a little close to the disk space limit on the VM. I cleared out some dangling images from yesterday's build. Though we probably should clear out some more space.
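For the record, clearing dangling images is standard Docker housekeeping; the exact invocation used on the VM may have differed, but it is along these lines:

    # Remove untagged ("dangling") images left behind by rebuilds to reclaim disk space.
    docker rmi $(docker images --filter "dangling=true" -q)
    # Then check how much room is left.
    df -h
    docker images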
Released. Wiki is updated.
If we decide to go the VM route in the future, conveyor might be worth checking out.
Yeah, I just came across it last night. Figured it would be better than us learning how to set up GitHub status updates and webhooks from scratch.
Related: #7
This doesn't include releases or anything else at this point. All this does is try to use Travis CI to do the builds. It also makes sure all images have been refreshed so that, hopefully, Dockerfiles that were left unchanged and don't have changed dependencies won't be rebuilt.
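For context, the Travis setup being proposed would be along these lines (a simplified, hypothetical sketch of a .travis.yml, not the exact file in this PR; the set of images built and any caching details are assumptions):

    # Simplified, hypothetical .travis.yml sketch using Travis CI's Docker service.
    sudo: required
    services:
      - docker
    script:
      - docker build -t jupyter/minimal-notebook minimal-notebook/
      - docker build -t jupyter/scipy-notebook scipy-notebook/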